Superscalar instruction issue
Abstract
Clearly, instruction issue and execution are closely related: the more parallel the instruction execution, the higher the requirements on the parallelism of instruction issue. Thus, we see a continuous and harmonized increase of parallelism in instruction issue and execution. This article focuses on superscalar instruction issue, tracing how parallel instruction execution and issue have increased performance. It also spans the design space of instruction issue, identifying important design aspects and available design choices. The article also demonstrates a concise way to represent the design space using DS trees (see the related box), reviews the most frequently used issue schemes, and highlights trends for each design aspect of instruction issue.

Von Neumann processors evolved by and large in two respects. One reflects the technological improvements, which are capped by increasing clock rates. The second is the functional evolution of processors, which came about primarily by raising the degree of parallelism in internal operations, first of all in instruction issue and execution.

Processor function evolved in three consecutive phases. First were the traditional von Neumann processors, characterized by both sequential issue and sequential instruction execution, as depicted in Figure 1. The chase for more performance then gave rise to parallel instruction execution. Designers introduced it using one of two orthogonal concepts: multiple (nonpipelined) execution units or pipelining. As a result, instruction-level parallel (ILP) processors emerged. Because early ILP processors used sequential instruction issue, processors arriving in the second phase of the evolution were scalar ILP processors. Subsequently, the degree of parallel execution rose even further through the use of multiple pipelined execution units.
While increasing the execution parallelism, designers soon reached the point where sequential instruction issue could no ...

Before choosing parallel instruction issue to increase performance, we must identify the important design aspects and choices. DS trees can help to concisely represent the design space of instruction issue.

Figure 1. Basic evolution phases of von Neumann processors. Thin portions of arrows indicate sequential operation; bold portions indicate parallel operation. Figure labels: parallelism of instruction execution, parallelism of instruction issue, typical implementation, processor performance. Phases shown, each with its typical implementation:
- Traditional von Neumann processors (sequential issue, sequential execution): nonpipelined processors
- Scalar ILP processors (sequential issue, parallel execution): processors with multiple nonpipelined execution units, or pipelined processors
- Superscalar ILP processors (parallel issue, parallel execution): VLIW and superscalar processors embodying multiple pipelined execution units
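The difference between sequential issue and parallel issue can be illustrated with a small sketch. It is not taken from the article: the function name `cycles_to_issue`, the toy instruction format, and the sample `PROGRAM` are all hypothetical, and the model assumes in-order issue, a one-cycle result latency, and unlimited execution units. An issue width of 1 stands in for a scalar ILP processor, a larger width for a superscalar one.

```python
from typing import Dict, List, Tuple

# Hypothetical toy format: (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]


def cycles_to_issue(program: List[Instr], width: int) -> int:
    """Count the cycles needed to issue `program` in order, at most `width` per cycle.

    Assumptions (illustrative only): every result becomes readable one cycle
    after its instruction issues, and execution units are never a bottleneck.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    ready_at: Dict[str, int] = {}  # register -> first cycle its value can be read
    cycle = 0
    pc = 0  # index of the next instruction to issue
    while pc < len(program):
        issued = 0
        while pc < len(program) and issued < width:
            dest, srcs = program[pc]
            # In-order issue: an instruction whose operands are not ready
            # blocks all younger instructions for the rest of this cycle.
            if any(ready_at.get(src, 0) > cycle for src in srcs):
                break
            ready_at[dest] = cycle + 1  # assumed one-cycle latency
            pc += 1
            issued += 1
        cycle += 1
    return cycle


# A made-up six-instruction sequence with two dependent instructions.
PROGRAM: List[Instr] = [
    ("r1", ()),
    ("r2", ()),
    ("r3", ("r1", "r2")),  # needs r1 and r2
    ("r4", ()),
    ("r5", ("r3", "r4")),  # needs r3 and r4
    ("r6", ()),
]

if __name__ == "__main__":
    for width in (1, 2, 4):
        print(f"issue width {width}: {cycles_to_issue(PROGRAM, width)} cycles")
```

Under these assumptions the sample sequence takes 6 cycles at width 1 and 3 cycles at width 2. Widening to 4 still gives 3 cycles, because the dependent instructions stall issue regardless of how many slots are available.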
Similar resources
An Exploration Of Instruction Fetch Requirement In Out-of-order Superscalar Processors
Automated design of superscalar processors can provide future in terms a cycles-per-instruction (CPI) using the application program statistics and the 124, Optimization of Instruction Fetch Mechanisms for High Issue Rates 117, A first-order superscalar processor model Karkhanis, Smith 2004 (Show Context). Because superscalar architectures include complicated control logic for out-of-order execu...
Simultaneous Multithreading – Blending Thread-level and Instruction-level Parallelism in Advanced Microprocessors
The paper discusses the reasons and possibilities of exploiting thread-level parallelism in modern microprocessors. The performance of a superscalar processor suffers when instruction-level parallelism is low. The underutilization due to missing instruction-level parallelism can be overcome by simultaneous multithreading, where a processor can issue multiple instructions from multiple threads e...
Simplifying Hardware for Out Of Order Execution using the Decoupling Paradigm
Future hardware and software technology will try to provide improved performance by extracting higher levels of parallelism. However the cost of a main memory access-in terms of missed instruction issue slots-increases with faster processors and greater issue widths. For this reason latency hiding technology remains one of the most important parts of high performance processor designs. In this ...
Optimum Instruction-level Parallelism (ILP) for Superscalar and VLIW Processors
Modern superscalar and VLIW processors fetch, decode, issue, execute, and retire multiple instructions per cycle. By taking advantage of instruction-level parallelism (ILP), processor performance can be improved substantially. However, increasing the level of ILP may eventually result in diminishing and negative returns due to control and data dependencies among subsequent instructions as well ...
Design of Instruction Address Queue for High Degree X86 Superscalar Architecture
A major hurdle of recent x86 superscalar processor designs is limited instruction issue rate due to the overly complex x86 instruction formats. To alleviate this problem, the machine states must be preserved and the instruction address routing paths must be simplified. We propose an instruction address queue, whose queue size has been estimated to handle saving of instruction addresses with thr...
Journal title:
- IEEE Micro
Volume 17
Publication date: 1997